Migrate to the reusable tox workflow #1102

kurtmckee · 2024-11-06T12:08:37Z

This PR migrates the SDK to the reusable tox workflow.

| Metrics | Before this PR | This PR (cache miss) | This PR (cache hit) |
|---|---|---|---|
| Total duration | 2m 0s | 2m 5s | 1m 46s |
| Total run time | 7m 51s | 5m 42s | 4m 9s |

📚 Documentation preview 📚: https://globus-sdk-python--1102.org.readthedocs.build/en/1102/

sirosen · 2024-11-07T05:10:32Z

.github/workflows/test.yaml

+          -
+            - "requirements/*/*.txt"
+            - "pyproject.toml"
+            - "toxfile.py"


Over the past week, I've been wondering whether tox-uv's faster venv building makes it unnecessary to cache the .tox dir contents. As long as the uv action's cache is populated, .tox/ can be quickly rebuilt.
I also wonder whether the balance between the two might, in fact, favor rebuilding over caching (since caching and hashing take some time).

I'm curious if you've given this any thought?

kurtmckee · 2024-11-07

I have. Caching the tarballs and wheels, instead of caching everything that was installed, hasn't previously been faster.

The difference shows up most clearly on Windows, so I'll share numbers from the feedparser logs. feedparser tests the highest and lowest supported CPython versions (which I recommend doing here too, but didn't introduce in this PR).

Here are the timings reported by the feedparser tests on Windows with a cache miss:

py3.9-chardet: OK (45.76=setup[7.22]+cmd[38.55] seconds)
py3.13-chardet: OK (41.80=setup[9.69]+cmd[32.11] seconds)
congratulations :) (87.68 seconds)

and for a cache hit:

py3.9-chardet: OK (42.22=setup[3.74]+cmd[38.48] seconds)
py3.13-chardet: OK (31.66=setup[0.21]+cmd[31.45] seconds)
congratulations :) (74.01 seconds)

(Note that the first tox environment always has the wheel build step counted as part of its setup.) Since the cmd times per tox environment are within ~0.7s of each other between the cache-miss and cache-hit runs, I'm inclined to trust that the differences in setup times aren't simply GitHub runner jitter.

So, my interpretation is that this is a win of ~13 seconds across 2 tox environments on Windows.

The cache-miss job took 1 second to look up the cache and miss, and then 5 seconds to upload the cache. The cache-hit job took 2 seconds to download the cache, which is an additional ~4 seconds won.
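The arithmetic behind those two figures can be checked directly from the tox summary lines. The following is a minimal sketch, not part of the workflow: the regular expression and the helper name `parse` are illustrative assumptions, and the cache lookup, upload, and download times are the whole-second values quoted above.

```python
import re

# Summary lines copied from the feedparser logs quoted above (Windows runner).
CACHE_MISS = """\
py3.9-chardet: OK (45.76=setup[7.22]+cmd[38.55] seconds)
py3.13-chardet: OK (41.80=setup[9.69]+cmd[32.11] seconds)
"""
CACHE_HIT = """\
py3.9-chardet: OK (42.22=setup[3.74]+cmd[38.48] seconds)
py3.13-chardet: OK (31.66=setup[0.21]+cmd[31.45] seconds)
"""

# Hypothetical pattern for the per-environment summary line shown above.
LINE = re.compile(
    r"^(?P<env>[\w.-]+): OK "
    r"\([\d.]+=setup\[(?P<setup>[\d.]+)\]\+cmd\[(?P<cmd>[\d.]+)\] seconds\)$"
)


def parse(summary):
    """Return {environment name: (setup seconds, cmd seconds)}."""
    result = {}
    for line in summary.splitlines():
        match = LINE.match(line.strip())
        if match:
            result[match["env"]] = (float(match["setup"]), float(match["cmd"]))
    return result


miss = parse(CACHE_MISS)
hit = parse(CACHE_HIT)

# Setup time saved across both environments when the cache is restored.
setup_saved = sum(s for s, _ in miss.values()) - sum(s for s, _ in hit.values())

# Largest per-environment difference in cmd time, used as a rough jitter gauge.
cmd_drift = max(abs(miss[env][1] - hit[env][1]) for env in miss)

# Cache bookkeeping: 1s lookup + 5s upload on a miss, 2s download on a hit.
overhead_saved = (1 + 5) - 2

print(f"setup time saved:       {setup_saved:.2f}s")
print(f"largest cmd difference: {cmd_drift:.2f}s")
print(f"cache overhead saved:   {overhead_saved}s")
```

The setup difference comes out at roughly 13 seconds and the bookkeeping difference at 4 seconds, which matches the two savings described above.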

I have consistently found that it's faster to cache what's installed, rather than caching what needs to be installed. tox-uv makes environment creation and package installation fast, but I don't think it's fast enough.

You're welcome to try improving on this! It's mechanically trivial, but extremely time-consuming. Here are the steps:

1. Create a branch off this project (or my own workflow repo).
2. Point a second project with a "significant" test suite at the new branch.
3. Repeatedly push and force-push to the second project, possibly manually deleting the caches, and keep switching back to the workflow project to make and push changes.

sirosen · 2024-11-07

I find this explanation 110% satisfactory. I'm probably not going to experiment with this, at least not within the next few days: my main question was about the comparison between time(cache miss + tox_uv setup + cache save) and time(cache hit + tox_uv setup), and you've already provided numbers for that.
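As a rough illustration of that comparison, the reported numbers can be combined into an expected per-run cost as a function of how often the cache hits. This is a sketch under stated assumptions: the function names are hypothetical, a run without caching is assumed to cost the same setup time as the cache-miss run, and the 2-second download is assumed to include the cache lookup.

```python
# Setup totals across both tox environments, from the tox summaries in the thread.
SETUP_COLD = 7.22 + 9.69  # environments built from scratch (cache miss)
SETUP_WARM = 3.74 + 0.21  # environments restored from the cache (cache hit)

# Cache bookkeeping times reported in the thread, in seconds.
LOOKUP_MISS = 1.0
UPLOAD = 5.0
DOWNLOAD_HIT = 2.0  # assumption: includes the lookup


def time_without_cache():
    """Setup cost if the .tox directory is never cached (assumed equal to a cold setup)."""
    return SETUP_COLD


def time_with_cache(hit_rate):
    """Expected setup cost when a fraction `hit_rate` of runs hit the cache."""
    miss_cost = LOOKUP_MISS + SETUP_COLD + UPLOAD
    hit_cost = DOWNLOAD_HIT + SETUP_WARM
    return hit_rate * hit_cost + (1 - hit_rate) * miss_cost


def break_even_hit_rate():
    """Hit rate above which caching is cheaper than always rebuilding."""
    miss_cost = LOOKUP_MISS + SETUP_COLD + UPLOAD
    hit_cost = DOWNLOAD_HIT + SETUP_WARM
    return (miss_cost - SETUP_COLD) / (miss_cost - hit_cost)


for rate in (0.0, 0.5, 1.0):
    print(f"hit rate {rate:.0%}: {time_with_cache(rate):.2f}s "
          f"vs {time_without_cache():.2f}s without caching")
print(f"break-even hit rate: {break_even_hit_rate():.0%}")
```

Under these assumptions, caching pays off once a bit more than a third of runs hit the cache. The figure only applies to this two-environment Windows sample.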

I am willing to accept some minor regressions in CI speeds if it gives us other improvements (e.g., workflow simplicity). In particular, as you've converted us over to the new workflow, I've been trying to track in the PRs what exactly is being used for cache keys, and whether it is "correct".
The uv action cache carries all of the raw packages already (in the runner's homedir), so there's some interesting interplay there with the .tox dir.

Thanks for laying this all out for me!

kurtmckee · 2024-11-07

Oh, I think I see what you're referring to. This isn't using the uv GitHub action, so there's no side caching happening, and pip caching isn't enabled for the setup-python action.

For cache keys, here's the rule that I've generally been following:

Already-included files

These files are always included by the reusable workflow:

- .python-identifiers (generated by the kurtmckee/detect-pythons action; ensures that the cache, which contains symlinks to Python interpreter executables, is invalidated if the Python versions change)
- .workflow-config.json (ensures that changes to the requested configuration invalidate the cache)
- tox.ini (ensures that changes to the tox configuration invalidate the cache)

Files you should use with cache-key-hash-files

In general, any files that contain tool configuration directives should be hashed for cache-busting.

- pyproject.toml
- mypy.ini
- .flake8
- .pre-commit-config.yaml
- setup.cfg
- requirements/*/*.txt
- poetry.lock

If these files change, it can indicate that different dependencies should be installed, or that a tool like mypy should change how it's writing its own cache, or any number of other things that might make the workflow cache less useful.
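To make the cache-busting idea concrete, here is a minimal sketch of how a key could be derived from the contents of those files. It is not the reusable workflow's actual implementation: the `cache_key` function, the `tox` prefix, and the demo file contents are hypothetical, and only the file names and glob patterns come from this PR.

```python
import hashlib
import tempfile
from pathlib import Path

# Files the thread says are always hashed by the reusable workflow.
ALWAYS_HASHED = (".python-identifiers", ".workflow-config.json", "tox.ini")


def cache_key(root, extra_patterns, prefix="tox"):
    """Hash the names and contents of every matching file into one key.

    Hypothetical helper: patterns that match nothing are simply skipped.
    """
    root = Path(root)
    patterns = list(ALWAYS_HASHED) + list(extra_patterns)
    files = sorted(
        {path for pattern in patterns for path in root.glob(pattern) if path.is_file()}
    )
    digest = hashlib.sha256()
    for path in files:
        # Include the relative path so that renames also change the key.
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return f"{prefix}-{digest.hexdigest()}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Made-up project layout and file contents, for demonstration only.
    (root / "requirements" / "py3.13").mkdir(parents=True)
    (root / "requirements" / "py3.13" / "test.txt").write_text("pytest\n")
    (root / "tox.ini").write_text("[tox]\nenvlist = py313\n")
    (root / "pyproject.toml").write_text("[project]\nname = 'demo'\n")

    # The patterns added to the workflow in this PR's diff.
    patterns = ["requirements/*/*.txt", "pyproject.toml", "toxfile.py"]
    key_before = cache_key(root, patterns)
    key_repeat = cache_key(root, patterns)

    # Changing any hashed file should produce a different key.
    (root / "pyproject.toml").write_text("[project]\nname = 'demo'\nversion = '2'\n")
    key_after = cache_key(root, patterns)

print("unchanged files give the same key:", key_before == key_repeat)
print("edited pyproject.toml changes key:", key_before != key_after)
```

Any file left out of the patterns can change without the key changing, so the old cache would still be restored. That is why tool configuration files belong in the list.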

Rename the test workflow to better reflect its purpose

48f38e2

kurtmckee added the no-news-is-good-news label ("This change does not require a news file") Nov 6, 2024

kurtmckee self-assigned this Nov 6, 2024

kurtmckee force-pushed the reusable-tox-workflow branch 7 times, most recently from 35590ba to 7459db7 November 6, 2024 15:39

Migrate to a reusable workflow

8ab0996

kurtmckee force-pushed the reusable-tox-workflow branch from 7459db7 to 8ab0996 November 6, 2024 15:39

kurtmckee marked this pull request as ready for review November 6, 2024 15:43

kurtmckee requested review from aaschaer, ada-globus, derek-globus, m1yag1, MaxTueckeGlobus and sirosen as code owners November 6, 2024 15:43

sirosen approved these changes Nov 7, 2024


kurtmckee merged commit 0b981a8 into main Nov 7, 2024
7 checks passed

kurtmckee deleted the reusable-tox-workflow branch November 7, 2024 14:15
